相机陷阱彻底改变了许多物种的动物研究,这些物种以前由于其栖息地或行为而几乎无法观察到。它们通常是固定在触发时拍摄短序列图像的树上的相机。深度学习有可能克服工作量以根据分类单元或空图像自动化图像分类。但是,标准的深神经网络分类器失败,因为动物通常代表了高清图像的一小部分。这就是为什么我们提出一个名为“弱对象检测”的工作流程,以更快的速度rcnn+fpn适合这一挑战。该模型受到弱监督,因为它仅需要每个图像的动物分类量标签,但不需要任何手动边界框注释。首先,它会使用来自多个帧的运动自动执行弱监督的边界框注释。然后,它使用此薄弱的监督训练更快的RCNN+FPN模型。来自巴布亚新几内亚和密苏里州生物多样性监测活动的两个数据集获得了实验结果,然后在易于重复的测试台上获得了实验结果。
translated by 谷歌翻译
本文介绍了端到端神经视频编解码器AIVC。它基于两个条件自动编码器MNET和CNET,用于运动补偿和编码。AIVC学会通过单端到端率延伸优化使用任何编码配置来压缩视频。此外,它在几个既定的测试条件下与最近的视频编码器HEVC提供了性能竞争。进行了全面的消融研究,以评估组成AIVC的不同模块的好处。该实现可在https://orange-opensource.github.io/aivc/上提供。
translated by 谷歌翻译
Over the past decade, neural networks have been successful at making predictions from biological sequences, especially in the context of regulatory genomics. As in other fields of deep learning, tools have been devised to extract features such as sequence motifs that can explain the predictions made by a trained network. Here we intend to go beyond explainable machine learning and introduce SEISM, a selective inference procedure to test the association between these extracted features and the predicted phenotype. In particular, we discuss how training a one-layer convolutional network is formally equivalent to selecting motifs maximizing some association score. We adapt existing sampling-based selective inference procedures by quantizing this selection over an infinite set to a large but finite grid. Finally, we show that sampling under a specific choice of parameters is sufficient to characterize the composite null hypothesis typically used for selective inference-a result that goes well beyond our particular framework. We illustrate the behavior of our method in terms of calibration, power and speed and discuss its power/speed trade-off with a simpler data-split strategy. SEISM paves the way to an easier analysis of neural networks used in regulatory genomics, and to more powerful methods for genome wide association studies (GWAS).
translated by 谷歌翻译
In intensively managed forests in Europe, where forests are divided into stands of small size and may show heterogeneity within stands, a high spatial resolution (10 - 20 meters) is arguably needed to capture the differences in canopy height. In this work, we developed a deep learning model based on multi-stream remote sensing measurements to create a high-resolution canopy height map over the "Landes de Gascogne" forest in France, a large maritime pine plantation of 13,000 km$^2$ with flat terrain and intensive management. This area is characterized by even-aged and mono-specific stands, of a typical length of a few hundred meters, harvested every 35 to 50 years. Our deep learning U-Net model uses multi-band images from Sentinel-1 and Sentinel-2 with composite time averages as input to predict tree height derived from GEDI waveforms. The evaluation is performed with external validation data from forest inventory plots and a stereo 3D reconstruction model based on Skysat imagery available at specific locations. We trained seven different U-net models based on a combination of Sentinel-1 and Sentinel-2 bands to evaluate the importance of each instrument in the dominant height retrieval. The model outputs allow us to generate a 10 m resolution canopy height map of the whole "Landes de Gascogne" forest area for 2020 with a mean absolute error of 2.02 m on the Test dataset. The best predictions were obtained using all available satellite layers from Sentinel-1 and Sentinel-2 but using only one satellite source also provided good predictions. For all validation datasets in coniferous forests, our model showed better metrics than previous canopy height models available in the same region.
translated by 谷歌翻译
Knowledge Distillation (KD) is a commonly used technique for improving the generalization of compact Pre-trained Language Models (PLMs) on downstream tasks. However, such methods impose the additional burden of training a separate teacher model for every new dataset. Alternatively, one may directly work on the improvement of the optimization procedure of the compact model toward better generalization. Recent works observe that the flatness of the local minimum correlates well with better generalization. In this work, we adapt Stochastic Weight Averaging (SWA), a method encouraging convergence to a flatter minimum, to fine-tuning PLMs. We conduct extensive experiments on various NLP tasks (text classification, question answering, and generation) and different model architectures and demonstrate that our adaptation improves the generalization without extra computation cost. Moreover, we observe that this simple optimization technique is able to outperform the state-of-the-art KD methods for compact models.
translated by 谷歌翻译
This work addresses the problems of (a) designing utilization measurements of trained artificial intelligence (AI) models and (b) explaining how training data are encoded in AI models based on those measurements. The problems are motivated by the lack of explainability of AI models in security and safety critical applications, such as the use of AI models for classification of traffic signs in self-driving cars. We approach the problems by introducing theoretical underpinnings of AI model utilization measurement and understanding patterns in utilization-based class encodings of traffic signs at the level of computation graphs (AI models), subgraphs, and graph nodes. Conceptually, utilization is defined at each graph node (computation unit) of an AI model based on the number and distribution of unique outputs in the space of all possible outputs (tensor-states). In this work, utilization measurements are extracted from AI models, which include poisoned and clean AI models. In contrast to clean AI models, the poisoned AI models were trained with traffic sign images containing systematic, physically realizable, traffic sign modifications (i.e., triggers) to change a correct class label to another label in a presence of such a trigger. We analyze class encodings of such clean and poisoned AI models, and conclude with implications for trojan injection and detection.
translated by 谷歌翻译
White matter bundle segmentation is a cornerstone of modern tractography to study the brain's structural connectivity in domains such as neurological disorders, neurosurgery, and aging. In this study, we present FIESTA (FIber gEneration and bundle Segmentation in Tractography using Autoencoders), a reliable and robust, fully automated, and easily semi-automatically calibrated pipeline based on deep autoencoders that can dissect and fully populate WM bundles. Our framework allows the transition from one anatomical bundle definition to another with marginal calibrating time. This pipeline is built upon FINTA, CINTA, and GESTA methods that demonstrated how autoencoders can be used successfully for streamline filtering, bundling, and streamline generation in tractography. Our proposed method improves bundling coverage by recovering hard-to-track bundles with generative sampling through the latent space seeding of the subject bundle and the atlas bundle. A latent space of streamlines is learned using autoencoder-based modeling combined with contrastive learning. Using an atlas of bundles in standard space (MNI), our proposed method segments new tractograms using the autoencoder latent distance between each tractogram streamline and its closest neighbor bundle in the atlas of bundles. Intra-subject bundle reliability is improved by recovering hard-to-track streamlines, using the autoencoder to generate new streamlines that increase each bundle's spatial coverage while remaining anatomically meaningful. Results show that our method is more reliable than state-of-the-art automated virtual dissection methods such as RecoBundles, RecoBundlesX, TractSeg, White Matter Analysis and XTRACT. Overall, these results show that our framework improves the practicality and usability of current state-of-the-art bundling framework
translated by 谷歌翻译
Accurate diagnosis and prognosis of Alzheimer's disease are crucial to develop new therapies and reduce the associated costs. Recently, with the advances of convolutional neural networks, methods have been proposed to automate these two tasks using structural MRI. However, these methods often suffer from lack of interpretability, generalization, and can be limited in terms of performance. In this paper, we propose a novel deep framework designed to overcome these limitations. Our framework consists of two stages. In the first stage, we propose a deep grading model to extract meaningful features. To enhance the robustness of these features against domain shift, we introduce an innovative collective artificial intelligence strategy for training and evaluating steps. In the second stage, we use a graph convolutional neural network to better capture AD signatures. Our experiments based on 2074 subjects show the competitive performance of our deep framework compared to state-of-the-art methods on different datasets for both AD diagnosis and prognosis.
translated by 谷歌翻译
There are many potential benefits to news readers accessing diverse sources. Modern news aggregators do the hard work of organizing the news, offering readers a plethora of source options, but choosing which source to read remains challenging. We propose a new framework to assist readers in identifying source differences and gaining an understanding of news coverage diversity. The framework is based on the generation of Discord Questions: questions with a diverse answer pool, explicitly illustrating source differences. To assemble a prototype of the framework, we focus on two components: (1) discord question generation, the task of generating questions answered differently by sources, for which we propose an automatic scoring method, and create a model that improves performance from current question generation (QG) methods by 5%, (2) answer consolidation, the task of grouping answers to a question that are semantically similar, for which we collect data and repurpose a method that achieves 81% balanced accuracy on our realistic test set. We illustrate the framework's feasibility through a prototype interface. Even though model performance at discord QG still lags human performance by more than 15%, generated questions are judged to be more interesting than factoid questions and can reveal differences in the level of detail, sentiment, and reasoning of sources in news coverage.
translated by 谷歌翻译
In a fissile material, the inherent multiplicity of neutrons born through induced fissions leads to correlations in their detection statistics. The correlations between neutrons can be used to trace back some characteristics of the fissile material. This technique known as neutron noise analysis has applications in nuclear safeguards or waste identification. It provides a non-destructive examination method for an unknown fissile material. This is an example of an inverse problem where the cause is inferred from observations of the consequences. However, neutron correlation measurements are often noisy because of the stochastic nature of the underlying processes. This makes the resolution of the inverse problem more complex since the measurements are strongly dependent on the material characteristics. A minor change in the material properties can lead to very different outputs. Such an inverse problem is said to be ill-posed. For an ill-posed inverse problem the inverse uncertainty quantification is crucial. Indeed, seemingly low noise in the data can lead to strong uncertainties in the estimation of the material properties. Moreover, the analytical framework commonly used to describe neutron correlations relies on strong physical assumptions and is thus inherently biased. This paper addresses dual goals. Firstly, surrogate models are used to improve neutron correlations predictions and quantify the errors on those predictions. Then, the inverse uncertainty quantification is performed to include the impact of measurement error alongside the residual model bias.
translated by 谷歌翻译